ABSTRACT

Several  choice  models  applicable  to  qualitative  response  data  collected 
in  marketing  research  are  reviewed  in  this  paper.   Following  a  discussion  of 
a  general  model,  four  binary  choice  models  are  compared  in  terms  of  underlying 
choice  processes  and  methods  of  estimation.  Availability  of  computer  algorithms 
for  analysis  and  areas  of  application  in  marketing  are  also  discussed. 

I.    Introduction

Development  and  testing  of  models  to  describe  consumer  choice  has  been 
a  major  concern  in  marketing  and  consumer  behavior  research  [6,  16,  29].   The 
criterion variable, consumer choice, has been operationalized in many ways in
the literature. Measures employed include amount bought or consumed of a product
or brand, brand chosen, intention to buy a brand, preference toward a
brand,  and  probability  of  brand  switching.   Usually,  however,  only  one  measure 
is  used  at  a  time  for  model  construction. 

From  a  technical  viewpoint,  measures  of  consumer  choice  belong  to  the 
three  basic  scales  of  measurement,  namely,  interval,  ordinal  or  categorical. 
Methods of analysis associated with these measures have respectively been multiple
regression, ordinal regression, and two-group or multiple discriminant
analysis  [8,  12]. 

The  focus  of  this  paper  is  on  models  when  the  consumer  choice  is  measured 
on  a  categorical  scale.   This  scale  represents  a  variety  of  consumer  choice 
situations such as buying or not buying a brand, viewing or not viewing television,
buying a gift or not, an industrial buyer seeing a salesman or not,
etc. In addition, the scale can also represent particular choices made within
a set of alternatives such as brands of a product category, prime time television
programs, television news programs, and suppliers of an industrial
product. 

Even  when  the  measure  of  consumer  choice  is  not  categorical,  it  can  easily 
be  converted  to  that  scale  by  a  suitable  regrouping  (or  collapsing)  of  the 
original  scale.   For  example,  consumers  can  be  classified  as  heavy  or  light 
on  the  basis  of  amount  of  reported  consumption  measured  on  an  interval  scale. 
Such a conversion offers a potential advantage of reducing the errors associated
with data collection. In addition, planning from the outset to use a
categorical scale would make it easier to collect such data in the first place
(as opposed to later conversion).
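
The regrouping described above can be illustrated with a minimal sketch. The cutoff of 5.0 units and the consumption figures are hypothetical values chosen only for illustration; the paper does not prescribe a particular cutoff.

```python
def dichotomize(amounts, cutoff):
    """Collapse interval-scaled amounts into 1 (heavy) or 0 (light)."""
    return [1 if amount >= cutoff else 0 for amount in amounts]


# Hypothetical reported weekly consumption (units) for five consumers.
weekly_units = [0.5, 3.0, 7.5, 1.0, 12.0]

# Assumed cutoff: consumers at or above 5.0 units are classified as heavy.
heavy_user = dichotomize(weekly_units, cutoff=5.0)
print(heavy_user)
```

In practice the cutoff would be chosen on substantive grounds (for example, a median split), and that choice affects how much information the collapsed scale retains.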

Despite the apparent advantages of the qualitative response variable (i.e.,
categorically scaled data), model building of consumer choice has
largely concentrated on measures of interval or ordinal scale. When the data
are categorical, researchers usually utilize chi-square (contingency) analysis
or multivariate discriminant analysis. An application of the multivariate
probit model for purchasing decisions of farmers can be found in [17]. It is
only recently that other models, namely, logit and log-linear, have been proposed
and used in marketing research [11]. The emphasis of that application is on
contingency table analysis in contrast to model building of consumer choice,
per se. These two models are relatively simple and quite well-known, but
were not much used in marketing prior to that application.

During  the  last  five  years  or  so,  there  has  been  a  renewed  interest  in 
the analysis and modeling of qualitative dependent variables among econometricians.
The interest apparently arose from the need to examine consumption
data such as transportation mode choices and from the inadequacy of ordinary
least squares analysis on a qualitative dependent variable owing to
heteroskedasticity [10]. This recent effort gave rise to extensions of techniques
such as the probit and logistic models and associated computer algorithms.
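
The heteroskedasticity mentioned above follows from a standard property of a 0/1 response: if the probability of a response of 1 is pi, the regression error equals 1 - pi with probability pi and -pi with probability 1 - pi, so its variance is pi(1 - pi) and changes from consumer to consumer. The sketch below simply tabulates this variance; the function name is an illustrative assumption.

```python
def error_variance(pi):
    """Variance of a 0/1 response whose probability of being 1 is pi."""
    return pi * (1.0 - pi)


# The error variance is largest at pi = 0.5 and shrinks toward the extremes,
# so a constant-variance assumption (as in ordinary least squares) fails.
for pi in (0.1, 0.3, 0.5, 0.9):
    print(pi, round(error_variance(pi), 4))
```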

Against this background, the objective of this paper is to review some
alternative models of consumer choice applicable to qualitative responses.
Specifically, we will consider four models: (1) discriminant model, (2) linear
probability model, (3) multivariate probit model, and (4) multivariate logit
model. For the sake of simplicity, we will only consider binary choice situations
in some parts of the discussion.

The remainder of this paper is organized into five additional sections.
In the second section, we present the notation and the general problem
of modeling consumer choice using qualitative response data and one particular
case leading to the above-mentioned four models. The third section
describes briefly the methods of estimation for the four binary choice models.
The problem of measuring the effect of changes in the independent variables
(e.g., characteristics of consumers or choice alternatives) on the probability
of choosing an alternative for each model is considered in the fourth section.
A brief review of the computer programs available for analysis of data according
to these models is presented in the fifth section. We conclude in the
final section with a discussion of potential applications in marketing and
some research issues with these models.

II.    A  General  Model  for  Qualitative  Responses 

The  problem  of  modeling  qualitative  responses  of  consumer  choice  from  the 
econometric  perspective  has  been  reviewed  by  McFadden  [20,  22].   When  the 
choice  is  binary,  the  work  by  Cox  [4]  is  relevant.   Other  references  from  a 
theoretical  point  of  view  include  [14,  24,  26,  30].  While  we  do  not  wish  to 
trace  through  the  historical  origins  of  the  subject,  mention  should  be  made 
of the pioneering work on probit analysis by Finney [7]. Some applications
in  areas  other  than  marketing  are  found  in  [5,  13,  27,  32,  33].   In  the  sequel, 
we  will  adapt  much  of  this  literature  as  it  relates  to  the  problem  of  modeling 
qualitative  responses  of  consumer  choice  in  marketing. 

To  see  the  relevance  of  the  problem  to  marketing  and  consumer  research, 
consider  the  following  situation.   Imagine  observing  a  sample  of  consumers 
choosing  one  of  many  brands  in  a  product  category  under  a  set  of  different 
choice  situations  or  scenarios  during  a  given  period  of  time.   Assume  further 
that it is possible to observe at least one choice for each consumer under
each situation in this period. (Of course, in many cases there may be only
one replication.) The data observed, namely, the brand chosen, are then the
responses for each consumer and are qualitative. The response is related to
the characteristics of consumers, characteristics of brands, and the characteristics
of the situation. The problem of modeling the qualitative responses
deals with the specification of the form of the function (f) relating the various
characteristics to the probability of response for each brand. The
methods of estimation are concerned with the determination of the parameters
of f as dependent on the availability of replications of observations and the
number of brands. We will assume that there exists no order in the responses³
(i.e., brands are not ordered).

Notation. We will adopt the following notation to present the model in
a formal manner. For simplicity, we will consider the case of one choice
situation.

$m$ = number of consumers
$n$ = number of brands
$R_i$ = number of replications observed under the situation for the ith
      consumer; $i = 1, 2, \ldots, m$; ($R_i \geq 1$)
$J$ = set of possible responses for any replication (i.e., set of n brands)
$r$ = number of attributes of the brands
$s$ = number of characteristics of the consumer
$XB_j$ = r-dimensional vector of attributes for the jth brand; ($j = 1, 2, \ldots, n$)
$XC_i$ = s-dimensional vector of characteristics for the ith consumer; ($i = 1, 2, \ldots, m$)
$\beta$ = r-dimensional parameter vector associated with the brand attributes
$\gamma$ = s-dimensional parameter vector associated with consumer characteristics
$F_{ij}$ = observed frequency with which brand j is chosen across all replications
      by the ith consumer ($i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$)
$P_{ij}$ = observed probability of choice of the jth brand by the ith consumer
      ($P_{ij} = F_{ij}/R_i$; $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, n$)
$Y_i$ = n-dimensional vector of probabilities $(P_{i1}, P_{i2}, \ldots, P_{in})$ of choice
      of the n brands for the ith consumer; $i = 1, 2, \ldots, m$. (If there is
      only one replication, then $Y_i$ will contain n-1 zeroes and one unity.)
$\alpha$ = a constant parameter.

Models. The theory of qualitative responses postulates the existence of
an indicator variable, denoted by $I$, which takes on different values across
various brands for a given consumer. In general, it is assumed to be a function
of variables $XB$ and $XC$. Much of the modeling work involves specification
of the functional form for $I$. A convenient starting point is the linear model
such as:

$I = \alpha + \beta' XB + \gamma' XC$ .  (1)

This form can be easily extended to include within-set interactions among the
attributes of brands or characteristics of consumers as well as between-set
interactions. Generally, however, such specification should be guided by the
substantive nature of the choice problem being modeled.

Further, the consumer is assumed to have threshold values on the indicator
scale which lead to the choices of various brands. Assumption of a particular
probability distribution for the threshold values would then generate a set of
theoretical choice probabilities $(\pi_{i1}, \pi_{i2}, \ldots, \pi_{in})$ for the ith person which are
functions of parameters associated with brands and/or persons. The observed
frequencies $(F_{i1}, F_{i2}, \ldots, F_{in})$ can then be assumed to arise from a multinomial
distribution with these theoretical probabilities. The parameters can be estimated
using maximum likelihood methods or least squares methods.
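
As a rough illustration of the maximum likelihood route, the sketch below evaluates the part of the multinomial log-likelihood that depends on the theoretical probabilities, namely the sum over consumers i and brands j of F_ij log(pi_ij). The function name and the small data set are hypothetical; the multinomial coefficient is omitted because it does not involve the parameters.

```python
import math


def multinomial_log_likelihood(frequencies, probabilities):
    """Sum of F_ij * log(pi_ij) over consumers i and brands j.

    Cells with zero observed frequency contribute nothing. A positive
    frequency paired with a zero probability raises ValueError.
    """
    total = 0.0
    for f_row, p_row in zip(frequencies, probabilities):
        for f, p in zip(f_row, p_row):
            if f > 0:
                total += f * math.log(p)
    return total


# Hypothetical data: 2 consumers, 3 brands.
frequencies = [[2, 1, 0], [0, 3, 1]]                      # F_ij
probabilities = [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3]]        # pi_ij, rows sum to 1
print(multinomial_log_likelihood(frequencies, probabilities))
```

A maximum likelihood procedure would express each pi_ij as a function of the model parameters and search for the parameter values that make this quantity as large as possible.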

This approach is indeed complicated for the general case. Several simplifications
occur, however, for special cases. In particular, we will consider
the  binary  choice  case  (i.e.,  n=2)  to  compare  the  above-mentioned  four  models 
for  the  situation  of  one  replication.   This  situation  is  highly  appropriate 
in  marketing  where  much  of  the  analysis  deals  with  cross-sectional  data.   Such 
data  come  closest  to  the  case  of  one  replication.   Further,  the  binary  choice 
analysis  can  be  repeated  to  model  the  choices  with  respect  to  each  brand  in 
the  choice  set. 

Case  n=2.   Here,  there  are  only  two  choice  alternatives.   Therefore,  we 
can  reduce  the  vector  variable  Y  to  a  scalar  variable  by  considering  only  the 
probabilities  for  one  of  the  two  brands.   Such  reduction  would  preserve  all 
of  the  information  in  the  data  for  the  case  of  one  replication.   Further,  the 
reduced  variable  is  either  1  or  0. 

The index can be written simply in terms of the consumer-specific variables.
Thus, the model would become:

$I = \alpha + \gamma' XC$ .  (2)

Let $I_i$ denote the threshold value specific to the ith consumer. The
four models (discriminant model, linear probability model, multivariate probit
model, and multivariate logistic model) would result from different assumptions
on the probability distributions for the threshold values, $I_i$. These are shown
in Table 1. The reader should note that different assumptions are also involved
with respect to the threshold values across consumers.

Insert Table 1 About Here

We have shown an extremely simplified conceptualization of the discriminant model in
order to keep the assumptions to a minimum. The general two-group discriminant
analysis model would follow when we assume multivariate normal distribution
for the consumer-specific variables ($XC$) and equal covariance matrix for the
two groups of consumers respectively choosing the two brands. See multivariate
texts by Anderson [1], Morrison [23] or Press [25] for a discussion of
these. In order to adapt this analysis to the modeling of qualitative responses,⁴
we indeed need additional knowledge on the prior belonging of the
consumer to the groups of buyers or nonbuyers of the brand.
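
The choice probabilities listed in Table 1 can be sketched as four small functions of the indicator value. This is a minimal illustration, assuming a unit normal distribution for the probit case and a standard logistic distribution for the logit case; the function and argument names are ours, not the paper's.

```python
import math


def prob_discriminant(index, cutoff):
    """Threshold has all its mass at `cutoff` (I_c): probability is 0 or 1."""
    return 1.0 if index > cutoff else 0.0


def prob_linear(index, a, b):
    """Threshold uniform on (a, b): probability rises linearly inside (a, b)."""
    if index <= a:
        return 0.0
    if index >= b:
        return 1.0
    return (index - a) / (b - a)


def prob_probit(index):
    """Threshold unit normal: probability is the normal CDF at the index."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))


def prob_logit(index):
    """Threshold standard logistic: probability is 1 / (1 + exp(-index))."""
    return 1.0 / (1.0 + math.exp(-index))


# Compare the four models at a few hypothetical indicator values.
for index in (-1.0, 0.0, 0.25, 1.0):
    print(index,
          prob_discriminant(index, cutoff=0.0),
          prob_linear(index, a=0.0, b=1.0),
          round(prob_probit(index), 4),
          round(prob_logit(index), 4))
```

In each case the probability of choosing brand 1 is the probability that the threshold falls below the consumer's indicator value, which is why the four models differ only in the assumed threshold distribution.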

III.   Estimation  of  Parameters  for  Four  Binary  Choice  Models 

Table  2  reviews  various  methods  of  estimation  appropriate  to  the  case  of 
single  replication  (i.e.,  when  y  is  either  0  or  1)  for  the  four  binary  choice 
models  under  comparison.  It  also  shows  the  major  problems  with  the  procedure 
and  properties  of  estimates.   The  methods  are  based  on  variations  of  least 
squares method or maximum likelihood procedure. Two additional comments may
be in order. First, the maximum likelihood method can be employed for any
probability distribution prespecified for the underlying choice process of
the consumer. Second, we have only covered one method of estimating the
parameters of the discriminant model; for others, see [1, 23, 25].
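
To make the iterative maximum likelihood procedure concrete, the following is a minimal sketch of Newton-Raphson estimation for a binary logit model with a constant and a single consumer characteristic. The restriction to one characteristic, the function name, and the data are assumptions made for illustration; this is not the implementation of any program discussed in this paper.

```python
import math


def fit_binary_logit(x, y, iterations=25, tolerance=1e-10):
    """Newton-Raphson ML estimates (alpha, gamma) for
    pi_i = 1 / (1 + exp(-(alpha + gamma * x_i))), with y_i in {0, 1}."""
    alpha, gamma = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(alpha + gamma * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score with respect to alpha
            g1 += (yi - p) * xi     # score with respect to gamma
            h00 += w                # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score.
        step_alpha = (h11 * g0 - h01 * g1) / det
        step_gamma = (h00 * g1 - h01 * g0) / det
        alpha += step_alpha
        gamma += step_gamma
        if abs(step_alpha) + abs(step_gamma) < tolerance:
            break
    return alpha, gamma


# Hypothetical single-replication data: one characteristic, 0/1 choice.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]
alpha_hat, gamma_hat = fit_binary_logit(x, y)
print(alpha_hat, gamma_hat)
```

The iteration assumes the two groups overlap on x; if the 0 and 1 responses were perfectly separated, the likelihood would have no finite maximum and the steps would not settle.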

When  the  replications  are  more  than  one,  several  other  methods  could  be 
employed.   One  of  these  [3]  involves  converting  the  observed  probability  into 
its logit, i.e., $\log[P/(1-P)]$, expanding it as a Taylor series in terms of
the parameters to be estimated and using the least squares method of estimation.
Some modifications to this method are possible in order to improve its accuracy
[^4, 32]. Empirical comparisons of various methods discussed in this
section as applied to Monte Carlo and real data can be found in [5, 24].
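
A minimal sketch in the spirit of the logit-based least squares approach is given below. It regresses the empirical logit on a single characteristic by weighted least squares, with weights R P (1 - P) taken from the first-order Taylor approximation to the variance of the logit. The weighting scheme, the single-characteristic restriction, and all names are assumptions made for illustration and may differ from the exact procedure in [3].

```python
import math


def fit_logit_wls(x, chosen, replications):
    """Weighted least squares fit of log[P/(1-P)] = alpha + gamma * x.

    P = chosen / replications is the observed probability for each group.
    Groups with P equal to 0 or 1 are skipped because the logit is undefined.
    """
    sw = swx = swz = swxx = swxz = 0.0
    for xi, f, r in zip(x, chosen, replications):
        p = f / r
        if p <= 0.0 or p >= 1.0:
            continue
        z = math.log(p / (1.0 - p))      # empirical logit
        w = r * p * (1.0 - p)            # approximate inverse variance of z
        sw += w
        swx += w * xi
        swz += w * z
        swxx += w * xi * xi
        swxz += w * xi * z
    det = sw * swxx - swx * swx
    gamma = (sw * swxz - swx * swz) / det
    alpha = (swz - gamma * swx) / sw
    return alpha, gamma


# Hypothetical replicated data: 20 replications at each of four x values.
x = [1.0, 2.0, 3.0, 4.0]
chosen = [2, 5, 9, 14]
replications = [20, 20, 20, 20]
print(fit_logit_wls(x, chosen, replications))
```

Unlike the iterative maximum likelihood procedure, this fit is obtained in closed form, which is one reason such methods are attractive when replications are available.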

IV.   Response Effects of Changes in Independent Variables

We  will  briefly  consider  the  effect  of  changes  in  consumer  characteristics 
on  the  theoretical  probability  of  choosing  the  brand  according  to  each  model. 
These  measures  are  useful  in  forecasting  the  demand  for  a  brand  due  to  changes 
in consumer characteristics and also in the development of strategies to influence
choice. Additionally, knowledge of the response coefficients could
be  valuable  in  testing  the  accuracy  of  alternative  formulations  of  the  choice 
process. 

Simply stated, these are $\partial \pi_i / \partial XC_k$, where $\pi_i$ is the theoretical probability
of choosing one brand (in the binary situation) and $XC_k$ is the kth measured
characteristic of the ith consumer. It is computed using the relationship:

$\frac{\partial \pi_i}{\partial XC_k} = \frac{\partial \pi_i}{\partial I_i} \cdot \frac{\partial I_i}{\partial XC_k}$  (3)

where $I_i$ is the indicator for the ith consumer. The response coefficients
computed using equation (3) are summarized below for each model. Of course,
to be correct, one needs to take into account the fact that probability cannot
exceed unity for the discriminant model.

Model                        Response Coefficient

Discriminant Model           $\gamma_k$

Linear Probability Model     $\gamma_k/(b-a)$ if $a < I_i < b$;
                             0 otherwise

Multivariate Probit Model    $\phi(I_i) \cdot \gamma_k$, where $\phi(\cdot)$ is the unit normal
                             density function

Multivariate Logit Model     $\pi_i (1 - \pi_i) \gamma_k$, where $\pi_i$ is the theoretical
                             value of probability at $I_i$.



While the value of the response coefficient is uniformly the same for the discriminant
model and the linear probability model (except for the end zones),
it depends upon the location of the indicator variable for the probit and logit
models. In the absence of the knowledge of the true underlying model, it is
difficult to choose between these coefficients in practice. Empirical evidence
and accuracy of predictive testing are some ways to resolve this issue. In
fact, Haberman [14, p. 311] claims:

...that no empirical evidence exists that the normal
distribution provides more accurate models than the
logistic distribution. Theoretical arguments have
been advanced which favor one or the other distribution,
but none of them appears convincing, at least
to the author.
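
The response coefficients of this section can be sketched numerically by applying the chain rule in equation (3) to each model. The sketch assumes the linear probability model's probability is (I - a)/(b - a) inside the interval (a, b), so that its coefficient is gamma_k/(b - a), and it assumes unit normal and standard logistic distributions for the probit and logit cases. All function names are illustrative.

```python
import math


def response_discriminant(gamma_k):
    """Constant response coefficient, as listed for the discriminant model."""
    return gamma_k


def response_linear(index, gamma_k, a, b):
    """Constant inside (a, b); zero in the end zones."""
    return gamma_k / (b - a) if a < index < b else 0.0


def response_probit(index, gamma_k):
    """Unit normal density at the index, times gamma_k."""
    density = math.exp(-0.5 * index * index) / math.sqrt(2.0 * math.pi)
    return density * gamma_k


def response_logit(index, gamma_k):
    """pi * (1 - pi) * gamma_k, with pi the logistic probability at the index."""
    pi = 1.0 / (1.0 + math.exp(-index))
    return pi * (1.0 - pi) * gamma_k


# Hypothetical coefficient gamma_k = 2.0 evaluated at several indicator values.
for index in (-2.0, 0.0, 0.5, 2.0):
    print(index,
          response_discriminant(2.0),
          response_linear(index, 2.0, a=0.0, b=4.0),
          round(response_probit(index, 2.0), 4),
          round(response_logit(index, 2.0), 4))
```

The printed values show the contrast drawn above: the first two coefficients do not vary with the indicator (apart from the end zones), while the probit and logit coefficients are largest near an indicator value of zero and shrink in the tails.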

V.   Computational  Algorithms 

Several computer programs exist for implementing these models. We will
briefly describe four of these: (a) Generalized Chi-square Analysis of Categorical
data using a weighted least squares program which has the acronym GENCAT
[18]; (b) Multiple Logistic Program due to Duncan and Walker [15, 32]; (c) Multivariate
dichotomous variable program [24]; and (d) Conditional logit multinomial
estimation program called XLOGIT [22, 34]. Our comments on these will
be necessarily very brief.

(a)  GENCAT  Program:   This  program  implements  the  analysis  of  multivariate 
categorical data. It enables estimation of functions to describe observed proportions
in terms of several descriptor variables using a weighted least squares
method.   It  also  computes  several  statistics  for  testing  hypotheses  on  the 
functional  forms  of  the  relationships. 

(b) Multiple Logistic Program: This program implements the method developed
by Duncan and Walker for estimating the probability of occurrence of an
event from dichotomous or polychotomous data. A recursive technique is used
in estimating the multiple logistic risk function in accordance with maximum
likelihood methods. The program also computes the linear discriminant function
for obtaining initial estimates in the iterative process.

(c)  Multivariate  Dichotomous  Variable  Program:  This  program  implements 
log-linear and logistic models for up to four jointly dependent dichotomous
variables  using  maximum  likelihood  methods.   Its  special  features  include 
ability  to  study  the  bivariate  interactions  of  the  exogenous  explanatory 
variables. 

(d) XLOGIT Program: This program implements the estimation of the conditional
logit multinomial model using maximum likelihood procedures. Estimation
is carried out by standard unconstrained maximization procedures. While
we  have  not  described  the  theory  of  this  procedure  in  this  paper,  the  program 
can  be  employed  for  estimating  the  binary  choice  models. 

VI.   Conclusions 

It should be clear from the foregoing discussion that there exists a significant
body of knowledge on the qualitative response models and that it pertains
almost exclusively to areas other than marketing. Researchers in marketing
and consumer behavior could possibly benefit from a close scrutiny of
the theory and analysis methods currently available in the literature.

While  we  have  largely  concentrated  on  the  binary  choice  models,  theory 
and  estimation  methodology  extend  to  the  polytomous  qualitative  variable. 
Multiple  response  variables  can  also  be  studied  in  this  framework. 

Obviously,  these  models  need  to  be  subjected  to  validation  and  testing. 
Opportunities  exist  for  predictive  testing  using  behavioral  experimental 
techniques.   The  resolution  as  to  which  model  to  use  and  which  method  of 
estimation  can  only  result  from  extensive  application  and  research  on  the 
underlying  choice  process. 

Nevertheless, various applications are possible in marketing and consumer
areas. We will briefly touch upon three directions: (i) direct applications
of the binary choice models reviewed; (ii) application to the decision processes
of one consumer toward a set of brands or concepts; and (iii) study of longitudinal
choice behavior.

Direct  applications  of  binary  choice  models  include  a  study  of  choice 
behavior  toward  brands,  services,  television  programs,  shopping  centers,  stores 
and  the  like.   Emphasis  here  would  be  to  fit  models  to  cross-sectional  data  and 
estimate  response  coefficients  to  changes  in  characteristics  of  the  population 
of  consumers.   Further,  future  demand  can  also  be  estimated.   Differences  among 
prespecified  segments  can  be  studied  by  fitting  models  to  samples  of  consumers 
in  each  segment.   Another  application  would  be  to  study  the  response/nonresponse 
behavior  in  survey  research. 

The general model can be applied to describe the choice process of one consumer
toward a set of brands or product concepts. This is the case when $m = 1$.
Such a situation is prototypical of the data collected in concept testing
studies using such methods as conjoint measurement. The response here would
be "no" or "yes" with respect to buying the brand represented by the concept
(or some other criterion). In this case, the model would be $I = \alpha + \beta' XB$. The
model can be fitted to data for each consumer, thereby enabling an examination
of individual differences in the response coefficients for changes in the
brand attributes. Rao and Winter [28] present an application of this approach
to the issue of product design and market segmentation. The general qualitative
response models can be employed to extend current approaches to modeling
comparative and categorical judgmental data [31].

The  methodology  can  also  be  used  to  analyze  panel  data.   The  problem  here 
would  be  to  estimate  the  transition  probabilities  from  one  time  period  to  the 
next  using  these  models  and  compare  them  to  known  stochastic  models  [2,  19]. 

FOOTNOTES

1. While canonical correlation is an appropriate method for models with
multiple measures, its use has been insignificant owing to difficulty of
interpreting results.

2. Other data such as amount bought could also be treated using these
models by appropriate discretization.

3. The problem of modeling responses that are either sequentially obtained
or ordered in any manner is more complicated and is beyond the scope of this
paper.

4. In fact, possibilities exist for combining a discriminant analysis model
with logit analysis for purposes of estimation; see [21].

5. These do not include the quadratic programming algorithms applicable to
the linear probability model.


TABLE 1

Some Assumptions and Probability of Choice for Four Binary Choice Models

Model                      Assumed Probability Distribution       Probability of Choosing
                           for Threshold Value                    Brand 1 for Consumer i

Discriminant Model*        Single point distribution with         0 if $I_c \geq I_i$
                           whole mass at $I_c$                    1 if $I_c < I_i$

Linear Probability Model   Uniform distribution in the            Varies linearly with $I_i$
                           interval $(a,b)$                       in the interval $(a,b)$:
                                                                  0 if $I_i \leq a$
                                                                  $(I_i - a)/(b - a)$ if $a < I_i < b$
                                                                  1 if $I_i \geq b$

Multivariate Probit Model  Normal probability distribution;       $\Phi(I_i)$
                           $\Phi(\cdot)$ is the cumulative
                           distribution function

Multivariate Logit Model   Logistic probability function;         $\{1 + \exp(-I_i)\}^{-1}$
                           $f(x) = \exp(-x)/\{1 + \exp(-x)\}^2$

* See text for elaboration.

TABLE 2

Estimation Methods and Properties of Estimates for Four Binary Choice Models

Model                 Estimation Method            Major Problems with            Properties of
                                                   the Method                     Estimates

(a) Discriminant      Weighted least squares,      Prediction of y could lie      Unbiased,
    Model*            usually a two-step           outside the (0,1) interval.    consistent
                      procedure.                   Extreme values of y
                                                   predictions could be biased.
                                                   Estimates are sensitive to
                                                   specification error.

(b) Linear            Quadratic programming to     1. Very costly to implement.   Consistent; not
    Probability       minimize squared error       2. Extreme value bias exists   unbiased, but
    Model             subject to inequality           in prediction.              estimates tend to be
                      constraints (e.g.,           3. Sensitive to                distributed tightly
                      Dantzig-Cottle                  specification error.        about true values.
                      Algorithm).

(c) Multivariate      Maximum likelihood method;   1. Very costly to implement.   Consistent,
    Probit Model      involves solution of         2. Need fairly large           not unbiased,
                      nonlinear equations using       samples.                    efficient.
                      iterative methods (e.g.,
                      Newton-Raphson method).

(d) Multivariate      Same as for (c)              Same as for (c)                Same as for (c)
    Logit Model

* This is not the same method used in standard packages for discriminant analysis.